What is domhandler?
The domhandler npm package is a backend module used to handle and manipulate HTML and XML documents. It provides a way to build a DOM (Document Object Model) from HTML/XML strings, which can then be manipulated or queried programmatically. This is particularly useful for server-side applications where you need to parse and interact with HTML/XML content.
What are domhandler's main functionalities?
Building DOM from HTML
This code demonstrates how to use domhandler to parse an HTML string into a DOM structure. The `DomHandler` is used in conjunction with `htmlparser2` to parse the HTML and build the DOM.
const { Parser } = require('htmlparser2');
const { DomHandler } = require('domhandler');
const html = '<div><p>Hello World</p></div>';
// The callback runs once parsing has finished (or failed).
const handler = new DomHandler((error, dom) => {
  if (error) {
    console.error(error);
  } else {
    console.log(dom);
  }
});
// Parser streams parse events into the handler.
const parser = new Parser(handler);
parser.write(html);
parser.end();
Manipulating DOM
This example shows how to manipulate the DOM after parsing. It changes the text inside a <p> tag from 'Hello World' to 'Hello DOMHandler'.
const { DomHandler } = require('domhandler');
const { Parser } = require('htmlparser2');
const html = '<div><p>Hello World</p></div>';
const handler = new DomHandler((error, dom) => {
  if (!error) {
    // dom[0] is the <div>; its first child is the <p> element.
    const pElement = dom[0].children[0];
    // The text node inside <p> stores its content in `data`.
    pElement.firstChild.data = 'Hello DOMHandler';
    console.log(pElement);
  }
});
const parser = new Parser(handler);
parser.write(html);
parser.end();
Other packages similar to domhandler
cheerio
Cheerio is a fast, flexible, and lean implementation of core jQuery designed specifically for the server. It uses a very similar approach to domhandler but provides a jQuery-like API for manipulating the DOM, making it more familiar to those who have used jQuery. Unlike domhandler, which is more low-level, cheerio abstracts many of the complexities involved in DOM manipulation.
jsdom
jsdom is another popular npm package that allows you to create a web browser environment from Node.js. It simulates a web page by creating a realistic document structure. While domhandler is primarily used for handling and manipulating DOM elements, jsdom provides a more comprehensive simulation of a web environment, including scripting and event capabilities.
domhandler
The DOM handler (formerly known as DefaultHandler) creates a tree containing all nodes of a page. The tree may be manipulated using the domutils library.
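To make the idea of "creating a tree" concrete, here is a minimal sketch of a handler that turns parser events into nested nodes. It is a simplified illustration, not domhandler's actual implementation: the class name TinyTreeHandler and the helper addNode are hypothetical, and the events are fired by hand instead of by a real parser.

```javascript
// Hypothetical, simplified handler: NOT domhandler's real implementation.
class TinyTreeHandler {
  constructor(callback) {
    this.callback = callback;
    this.root = [];  // top-level nodes
    this.stack = []; // elements that are currently open
  }

  // Append a node to the innermost open element, or to the root list.
  addNode(node) {
    const parent = this.stack[this.stack.length - 1];
    (parent ? parent.children : this.root).push(node);
  }

  onopentag(name, attribs) {
    const element = { type: 'tag', name, attribs, children: [] };
    this.addNode(element);
    this.stack.push(element);
  }

  ontext(data) {
    this.addNode({ type: 'text', data });
  }

  onclosetag() {
    this.stack.pop();
  }

  onend() {
    this.callback(null, this.root);
  }
}

// Fire, by hand, the events a parser would emit for
// '<div><p>Hello World</p></div>'.
let tinyTree = null;
const tinyHandler = new TinyTreeHandler((error, dom) => {
  tinyTree = dom;
});
tinyHandler.onopentag('div', {});
tinyHandler.onopentag('p', {});
tinyHandler.ontext('Hello World');
tinyHandler.onclosetag('p');
tinyHandler.onclosetag('div');
tinyHandler.onend();
console.log(JSON.stringify(tinyTree, null, 2));
```

The stack of open elements is what lets each new node find its parent; the real handler does more bookkeeping than this sketch.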
Usage
var handler = new DomHandler([ <func> callback(err, dom), ] [ <obj> options ]);
Available options are described below.
Example
var htmlparser = require("htmlparser2");
var rawHtml = "Xyz <script language= javascript>var foo = '<<bar>>';< / script><!--<!-- Waah! -- -->";
var handler = new htmlparser.DomHandler(function (error, dom) {
    if (error) {
        // [...do something for errors...]
    } else {
        // [...parsing done, do something...]
        console.log(dom);
    }
});
var parser = new htmlparser.Parser(handler);
parser.write(rawHtml);
parser.end();
Output:
[{
data: 'Xyz ',
type: 'text'
}, {
type: 'script',
name: 'script',
attribs: {
language: 'javascript'
},
children: [{
data: 'var foo = \'<bar>\';<',
type: 'text'
}]
}, {
data: '<!-- Waah! -- ',
type: 'comment'
}]
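Because the result is a plain array of nested nodes, it can be walked with ordinary recursion. The sketch below copies the node list from the output above and gathers the contents of its text nodes; collectText is a hypothetical helper written for this illustration, not part of domhandler or domutils.

```javascript
// Node list copied from the example output above.
const exampleDom = [
  { data: 'Xyz ', type: 'text' },
  {
    type: 'script',
    name: 'script',
    attribs: { language: 'javascript' },
    children: [{ data: "var foo = '<bar>';<", type: 'text' }],
  },
  { data: '<!-- Waah! -- ', type: 'comment' },
];

// Hypothetical helper: concatenate the data of every text node,
// depth first. Comment nodes are skipped.
function collectText(nodes) {
  let text = '';
  for (const node of nodes) {
    if (node.type === 'text') {
      text += node.data;
    } else if (node.children) {
      text += collectText(node.children);
    }
  }
  return text;
}

console.log(collectText(exampleDom));
```

For real projects, the domutils library mentioned earlier is the more robust choice for this kind of traversal.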
Option: normalizeWhitespace
Indicates whether whitespace in text nodes should be normalized, that is, whether each run of whitespace characters should be replaced with a single space. The default value is "false".
The following HTML will be used:
<font>
<br>this is the text
<font>
Example: true
[{
type: 'tag',
name: 'font',
children: [{
data: ' ',
type: 'text'
}, {
type: 'tag',
name: 'br'
}, {
data: 'this is the text ',
type: 'text'
}, {
type: 'tag',
name: 'font'
}]
}]
Example: false
[{
type: 'tag',
name: 'font',
children: [{
data: '\n\t',
type: 'text'
}, {
type: 'tag',
name: 'br'
}, {
data: 'this is the text\n',
type: 'text'
}, {
type: 'tag',
name: 'font'
}]
}]
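The difference between the two outputs can be reproduced with a single regular expression. This is a sketch of the effect only, assuming normalization means collapsing each run of whitespace into one space; normalizeWhitespaceSketch is a hypothetical name and is not taken from the library's source.

```javascript
// Hypothetical helper: collapse every run of whitespace into one space.
function normalizeWhitespaceSketch(data) {
  return data.replace(/\s+/g, ' ');
}

// The two text nodes from the "false" output above.
const rawTextNodes = ['\n\t', 'this is the text\n'];
const normalizedTextNodes = rawTextNodes.map(normalizeWhitespaceSketch);

// JSON.stringify makes the whitespace visible in the log.
console.log(JSON.stringify(normalizedTextNodes));
```

Applied to the text nodes of the "false" output, this yields the same strings as the "true" output: a single space, and the sentence followed by one trailing space.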
Option: withDomLvl1
Adds DOM level 1 properties to all elements.
Option: withStartIndices
Indicates whether a startIndex property will be added to nodes. When the parser is used in a non-streaming fashion, startIndex is an integer indicating the position of the start of the node in the document. The default value is "false".
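One possible use of startIndex is mapping a node back to a line and column in the source, for example in error messages. The sketch below assumes startIndex is a zero-based character offset into the original string. The node object is built by hand rather than produced by the parser, and offsetToPosition is a hypothetical helper.

```javascript
// Hypothetical helper: convert a zero-based character offset into a
// 1-based line and column within the source string.
function offsetToPosition(source, offset) {
  const lines = source.slice(0, offset).split('\n');
  return {
    line: lines.length,
    column: lines[lines.length - 1].length + 1,
  };
}

const sourceHtml = '<font>\n\t<br>this is the text\n<font>';

// Hand-built stand-in for a parsed node; the startIndex value is an
// assumption (zero-based offset of the '<' that opens the tag).
const brNode = {
  type: 'tag',
  name: 'br',
  startIndex: sourceHtml.indexOf('<br>'),
};

const brPosition = offsetToPosition(sourceHtml, brNode.startIndex);
console.log(brNode.name, 'starts at line', brPosition.line, 'column', brPosition.column);
```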
Option: withEndIndices
Indicates whether an endIndex property will be added to nodes. When the parser is used in a non-streaming fashion, endIndex is an integer indicating the position of the end of the node in the document. The default value is "false".